Similarity estimators for irregular and age-uncertain time series

Author

  • K. Rehfeld
Abstract

Paleoclimate time series are often irregularly sampled and age uncertain, an important technical challenge that must be overcome to reconstruct past climate variability and dynamics successfully. Visual comparison and interpolation-based linear correlation approaches have been used to infer dependencies from such proxy time series. While the former is subjective, not measurable, and unsuitable for comparing many data sets at a time, the latter introduces interpolation bias, and both face difficulties if the underlying dependencies are nonlinear. In this paper we investigate similarity estimators that could be suitable for the quantitative investigation of dependencies in irregular and age-uncertain time series. We compare the Gaussian-kernel-based cross-correlation (gXCF, Rehfeld et al., 2011) and mutual information (gMI, Rehfeld et al., 2013) against their interpolation-based counterparts and the new event synchronization function (ESF). We test the efficiency of the methods in estimating coupling strength and coupling lag numerically, using ensembles of synthetic stalagmites with short, autocorrelated, linearly and nonlinearly coupled proxy time series, and in an application to real stalagmite time series. In the linear test case, coupling strength increases are identified consistently for all estimators, while in the nonlinear test case the correlation-based approaches fail. When the dating of the synthetic stalagmite is perfectly precise, the lag at which the time series are coupled is identified correctly as the maximum of the similarity functions in around 60–55 % (in the linear case) to 53–42 % (for the nonlinear processes) of the cases. If the age uncertainty increases beyond 5 % of the time series length, however, the true coupling lag is identified no more often than the other lags for which the similarity function was estimated. Age uncertainty contributes up to half of the uncertainty in the similarity estimation process. Time series irregularity contributes less, particularly for the adapted Gaussian-kernel-based estimators and the event synchronization function. The introduced link strength concept summarizes the hypothesis test results and balances the individual strengths of the estimators: while gXCF is particularly suitable for short and irregular time series, gMI and the ESF can identify nonlinear dependencies. The ESF could, in particular, be suitable for studying extreme event dynamics in paleoclimate records. Programs to analyze paleoclimatic time series for significant dependencies are included in a freely available software toolbox.
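To make the idea of a Gaussian-kernel-based cross-correlation more concrete, the following is a minimal Python sketch, not the implementation from the authors' toolbox. It assumes the commonly described form of the estimator: both series are standardized, and every pair of observations is weighted by a Gaussian of the difference between its time separation and the lag of interest, so no interpolation onto a regular grid is needed. The function name `gaussian_kernel_xcf`, the bandwidth of one quarter of the mean sampling interval, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_xcf(tx, x, ty, y, lags, bandwidth):
    """Kernel-weighted cross-correlation for irregularly sampled series.

    Hypothetical helper: a sketch of the general technique, assuming a
    Gaussian kernel applied to pairwise observation-time differences.
    """
    tx, x, ty, y = (np.asarray(a, dtype=float) for a in (tx, x, ty, y))
    # Standardize both series to zero mean and unit variance.
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean()) / y.std()
    # Pairwise time differences ty[j] - tx[i] and pairwise products.
    dt = ty[None, :] - tx[:, None]
    prod = xs[:, None] * ys[None, :]
    out = np.empty(len(lags))
    for k, lag in enumerate(lags):
        # Pairs whose separation is close to the lag get the largest weight.
        w = np.exp(-0.5 * ((dt - lag) / bandwidth) ** 2)
        out[k] = np.sum(w * prod) / np.sum(w)
    return out

# Synthetic illustration (assumed setup): one smooth, autocorrelated signal
# observed at two different irregular sets of times, with y lagging x by 5.
rng = np.random.default_rng(0)
signal = np.convolve(rng.standard_normal(2100), np.ones(10) / 10, mode="same")
ix = np.sort(rng.choice(np.arange(100, 2000), size=300, replace=False))
iy = np.sort(rng.choice(np.arange(100, 2000), size=300, replace=False))
true_lag = 5
x_obs = signal[ix]
y_obs = signal[iy - true_lag]

lags = np.arange(-20, 21)
h = 0.25 * np.mean(np.diff(ix))  # assumed bandwidth: 1/4 of mean spacing
xcf = gaussian_kernel_xcf(ix, x_obs, iy, y_obs, lags, h)
print("lag of maximum:", lags[np.argmax(xcf)])
```

The lag at which the returned function peaks is the coupling-lag estimate in the sense used in the abstract. Because the weights are normalized per lag rather than derived from a proper covariance matrix, the values in this sketch are not strictly bounded by 1. Age uncertainty could be explored by recomputing the function for perturbed copies of `tx` and `ty`, which this sketch does not do.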


Related resources

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and measuring their similarity is a core part of the MTS analysis process. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...


Sieve Inference on Possibly Misspecified Semi-nonparametric Time Series Models

This paper first establishes the asymptotic normality of plug-in sieve M estimators of possibly irregular functionals of semi-nonparametric time series models. We show that, even when the sieve score process is not a martingale difference, the asymptotic variances of plug-in sieve M estimators of irregular (i.e., slower than root-T estimable) functionals are the same as those for independent da...


A Histogram Kernel Density Estimation Based Similarity Approach for Uncertain Time Series

With the development of information science and the study of uncertainty, research on traditional time series has met new challenges. This paper takes the similarity computation of uncertain time series as its research target. First, the uncertainty of the time series is studied systematically, and the reasons for the uncertainty are discussed. Then, based on kernel density estimation, the ...


Probabilistic Similarity Search for Uncertain Time Series

A probabilistic similarity query over uncertain data assigns to each uncertain database object o a probability indicating the likelihood that o meets the query predicate. In this paper, we formalize the notion of uncertain time series and introduce two novel and important types of probabilistic range queries over uncertain time series. Furthermore, we propose an original approximate representat...


Modified Maximum Likelihood Estimation in First-Order Autoregressive Moving Average Models with some Non-Normal Residuals

When modeling time series data using autoregressive-moving average processes, it is a common practice to presume that the residuals are normally distributed. However, sometimes we encounter non-normal residuals and asymmetry of data marginal distribution. Despite widespread use of pure autoregressive processes for modeling non-normal time series, the autoregressive-moving average models have le...




Publication date: 2014